dark data [English]


InterPARES Definition

n. ~ Information typically collected during routine business operations for regulatory compliance, rather than for analysis or to support business decisions.

Other Definitions

  • Gartner IT Glossary (†298 s.v. "dark data"): The information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes (for example, analytics, business relationships and direct monetizing). Similar to dark matter in physics, dark data often comprises most organizations’ universe of information assets. Thus, organizations often retain dark data for compliance purposes only. Storing and securing data typically incurs more expense (and sometimes greater risk) than value.

Citations

  • Misra 2014 (†370): In a Q&A, JPL data scientist Rob Witoff explained how they turn the terabytes upon terabytes of data they work with into visualizations, saying "Our most impactful visuals have been to showcase the existence of information in data that hadn't previously been considered. We think of this as illuminating 'Dark Data'." (†374)
  • Techopedia (†411 s.v. "dark data"): Dark data is a type of unstructured, untagged and untapped data that is found in data repositories and has not been analyzed or processed. It is similar to big data but differs in how it is mostly neglected by business and IT administrators in terms of its value. ¶ Dark data is also known as dusty data. ¶ Dark data is data that is found in log files and data archives stored within large enterprise class data storage locations. It includes all data objects and types that have yet to be analyzed for any business or competitive intelligence or aid in business decision making. Typically, dark data is complex to analyze and stored in locations where analysis is difficult. The overall process can be costly. It also can include data objects that have not been seized by the enterprise or data that are external to the organization, such as data stored by partners or customers. ¶ IDC, a research firm, stated that up to 90 percent of big data is dark data. (†2703)
  • Wikipedia (†387 s.v. "dark data"): Data which is acquired through various computer network operations but not used in any manner to derive insights or for decision making. In some cases the organization may not even be aware that the data is being collected. . . . In an industrial context, dark data can include information gathered by sensors and telematics. . . . Often it is stored for regulatory compliance and record keeping. Some organizations believe that dark data could be useful to them in the future, once they have acquired better analytic and business intelligence technology to process the information. (†2701)